PREFACE


This study was conducted under Task 771QI8, Selection and Classification Technologies. The 
research focuses on the development of procedures and techniques to refine and improve 
measurement devices used in the Air Force operational testing program. 

This work represents an attempt to refine the aptitude indexes of the Armed Services 
Vocational Aptitude Battery (ASVAB), thereby improving their predictive accuracy and 
consequently the utility of selection measures. This effort supports the subthrust area Assessment of 
Personnel Qualifications, under the major thrust area of Manpower and Force Management. 


WEIGHTING OF APTITUDE COMPONENTS BASED ON DIFFERENCES 
IN TECHNICAL SCHOOL DIFFICULTY 


I. BACKGROUND AND INTRODUCTION 

The use of the official aptitude battery (called by various names over the past three decades) for 
selection and classification of Air Force enlisted personnel has always taken the form of computation and 
interpretation of four or more Aptitude Indexes (AIs) (Weeks, Mullins, & Vitola, 1975). The use of AIs 
appeared in the first Air Force aptitude battery (AC-1A). It was not administratively feasible in 1948 to 
produce a unique composite score for each Air Force job, but it was assumed that differential aptitude 
composites were desirable. Job clusters were developed on the basis of subjective judgment and job 
analysis data. Through study of test results, scientists formed clusters of tests (AIs) which were reasonably 
homogeneous internally and predictive of success in schools in the separate job clusters. 

During succeeding years, various changes in composition of the AIs have been made, mostly by 
administrative fiat, so that at the present time the current enlisted aptitude battery produces four Air 
Force AIs: Mechanical (M), Administrative (A), General (G), and Electronic (E). Along the way, a great 
deal of research has been done on the enlisted aptitude battery, but few studies questioned the 
effectiveness of the concept of M, A, G, and E aptitude indexes or explored novel ways of weighting 
subtests to produce the M, A, G, and E composites. This study addresses the utility of a different method 
for weighting the M, A, G, and E composites. 

Historically, subtest weighting has been accomplished partly by science and partly by artistry. 
Through various multiple correlational techniques, an optimum weight has been derived within each Air 
Force Specialty for each subtest score against final technical school grade for that specialty. Then the sets 
of weights for specialties have been scrutinized within a particular aptitude area (say, M), looking for a 
minimal set of predictors which consistently exhibit positive non-trivial weights across the entire area. 
When such a set has been found (three or four predictors), the weights are all rounded to 1.0, and again 
multiple correlation coefficients are computed between school grades and these unit-weighted predictor 
variables to see if the validities are holding up after conversion from optimum weights to unit weights. 
Ordinarily, little is lost by converting to unit weighting (see Wainer, 1976). 

One problem, however, has been recognized with this system. Different schools within each aptitude 
area require different AI levels to qualify for entry. For example, some A schools require only a score at the 
40th percentile for admission, while others require the 80th percentile. Both schools, however, give grades 
on the same apparent scale, from 70 to 100, even though the A80 school is undoubtedly much more 
difficult than the A40 school. Therefore, a final school grade of 82 would refer to a lesser accomplishment 
in the A40 school than it would in the A80 school. When validities are computed and predictor weights 
assigned across entire aptitude areas regardless of school level (see Figure 1), some method is needed for 
adjusting school grades in individual schools upward or downward as a function of the prerequisite levels 
of ability (A40, A50, A80). Such a method would ensure that graduates of A40 and A80 schools have 
criterion scores based on the same metric. In short, if it could be done, predictor weights for subtests 
would be more accurate, and AI scores could be computed which would be more efficient than they are 
now. The problem, then, is how to estimate what the school grade of the A80 students would have been if 
they had taken the A40 course and if there had not been a ceiling score of 100 on school grades. 

Figure 1. Schematic representation of depressing effect of similar criterion 
range on overall validity coefficient computed across 
schools requiring different levels of aptitude. 



When restated in this form, the problem almost resolves itself. The solution is to find a constant that 
can be added to the school grades of A80 students to reflect the difference in difficulty between the A40 
and A80 schools. Such a constant should improve the situation in the manner depicted in Figures 1 and 2. 
The computation of this constant requires only that the mean school grade of the A40 school be known 
and that an estimate can be made of the mean school grade the A80 students would have earned if they 
had attended the A40 school and if the 100 score ceiling were removed. Such an estimate can be made 
reasonably well by computing the best AI in the A40 school from available predictor information. This AI 
is then used to predict the grades of members of the A80 group. The difference between the mean of the 
observed criterion grades of the A40 group and the mean of the predicted grades of the A80 group 
provides the required constant. This constant is then added to the criterion grade of each subject in the 
A80 group to provide a new criterion metric so that grades of all students (both A40 and A80) are arranged 
on the same criterion scale. 


The formula to derive the new criterion K score is as follows: 

    $K_j = G_j + (\bar{C}_T - \bar{C}_B)$ 

where 

$K_j$ = the transformed grade score of person j. 

$G_j$ = the observed grade score of person j. 

$\bar{C}_B$ = the mean of the composite scores generated for students in the Base group (the group 
in which the prediction composite is generated; i.e., the A40 group in the above 
example). 

$\bar{C}_T$ = the mean of the composite scores generated for subjects in a Target group by applying 
weights developed in the Base group (the Target group is the group to which the 
criterion grade correction will be applied. In the above example, A80 would be the 
Target group). 


Figure 2. Schematic representation of higher validity coefficient 
attainable if different level schools are placed 
on same criterion metric by adding constants. 

When the scores on the K criterion have been computed, the situation depicted in Figure 2 will have 
been achieved, and an adjusted criterion will have become available for use in developing new weights for 
the available predictor variables. The new weights can be used to establish a new aptitude composite 
which may reasonably be expected to predict success throughout the aptitude area, disregarding level, 
better than any set of weights computed in the conventional way. 
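The adjustment just described can be sketched in a few lines of Python. This is a minimal illustration, not the report's actual computation: it assumes a single predictor fitted by ordinary least squares in place of the multi-subtest composite, and the function names (`fit_line`, `k_criterion`) and all scores are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def k_criterion(base_x, base_grades, target_x, target_grades):
    """Return (constant, adjusted Target grades) on the Base-group metric."""
    # Weights are developed in the Base group only.
    slope, intercept = fit_line(base_x, base_grades)
    # Mean composite (predicted grade) in the Base and Target groups.
    c_base = mean([intercept + slope * x for x in base_x])
    c_target = mean([intercept + slope * x for x in target_x])
    constant = c_target - c_base
    # The constant is added to every observed Target-group grade.
    return constant, [g + constant for g in target_grades]

# Hypothetical scores, invented for illustration only.
base_scores = [40, 45, 50, 55, 60]
base_grades = [78, 80, 83, 85, 89]
target_scores = [80, 85, 90, 95]
target_grades = [82, 84, 88, 90]

constant, k_scores = k_criterion(base_scores, base_grades,
                                 target_scores, target_grades)
print(round(constant, 2), [round(k, 2) for k in k_scores])
```

Because the adjusted grades are not capped, they can exceed 100, which is the point of removing the ceiling on the Target-group criterion.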

Two sets of weights are computed. The first set comes from predicting the actual grades on just the 
A40 group and is done only as an intermediate step to determine the constant used to adjust the grades of 
the A80 group. The second set of weights comes from predicting a combination of the actual grades on the 
A40 group and the adjusted grades on the A80 group. This second set of weights defines the new aptitude 
composite. 








After the new weights (against the K criterion) have been established and composite aptitude scores 
have been computed for all students in the study, it is necessary to check empirically to see whether the 
new composites really do predict actual school grades better than do the old ones. The objective of this 
study was to develop new weights for aptitude composites computed from K-criterion scores for a sample 
of the population and to cross-apply these weights to another sample. 


II. APPROACH 


Sample Population 

The sample consisted of all airmen entering the Air Force between January 1977 and September 
1978 on whom subtest and AI scores on Armed Services Vocational Aptitude Battery (ASVAB) Forms 5, 
6, or 7 and numerical technical training final school grades were available. School failures were omitted 
from the sample, as well as all subjects in schools where the total number of graduates during this 2-year 
period was less than 50. Total N for the sample after all necessary deletions was 88,[illegible]. Of these, 10,715 
were graduates of schools requiring multiple aptitude prerequisites (e.g., E80 and M[illegible]), and 68,184 were 
from [illegible] schools with only a single prerequisite (e.g., A60). The 10,715 subjects in schools with multiple 
prerequisites were arbitrarily called X, and all computations and data manipulations applied to the M, the 
A, the G, and the E subjects were also applied to the X subjects (even though this group consisted of M, A, 
G, and E subjects intermingled). 

The subjects in each school were randomly divided equally into a computing (C) subsample and a 
cross-validation (V) subsample. Then, within each subsample, schools were combined to form the groups 
shown in Table 1. 
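The within-school random split into computing and cross-validation halves might look like the following sketch. The record layout (`id` and `school` keys), the function name `split_by_school`, and the fixed seed are assumptions made for illustration; the report does not describe how the randomization was carried out.

```python
import random
from collections import defaultdict

def split_by_school(records, seed=0):
    """Randomly halve each school's graduates into C and V subsamples."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_school = defaultdict(list)
    for rec in records:
        by_school[rec["school"]].append(rec)

    computing, validation = [], []
    for school in sorted(by_school):
        members = by_school[school][:]
        rng.shuffle(members)
        half = len(members) // 2
        computing.extend(members[:half])
        # With an odd count, the extra subject goes to the V subsample.
        validation.extend(members[half:])
    return computing, validation

# Hypothetical roster: 10 graduates of school "A" and 6 of school "B".
roster = [{"id": i, "school": "A" if i < 10 else "B"} for i in range(16)]
c_sample, v_sample = split_by_school(roster)
print(len(c_sample), len(v_sample))
```

Splitting inside each school, rather than across the pooled sample, keeps every school represented in both halves so that weights developed in C can be checked school by school in V.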


Table 1. Groups by Aptitude Area and by Entry Level 

Group    N(C+V)     Group    N(C+V)     Group    N(C+V) 

M40       8,305a    G40       3,530     E50       1,852 
M50       8,070     G45      14,271a    E60       1,134 
A40       7,250a    G50         154     E80      10,322a 
A50         224     G60       8,710     X40       5,078 
A60       2,251     G65         121     X50       1,260 
A70         205     G80         773     X60      13,371a 
A80       1,204 

a Base group. All others are Target groups. 


Predictor/Criterion Variables 

The following variables were available (or were computed) on each subject: 

1. Technical school final grades, graduates only. 

2. ASVAB subtest score: Numerical Operations 

3. ASVAB subtest score: Attention to Detail 

4. ASVAB subtest score: Word Knowledge 

5. ASVAB subtest score: Arithmetic Reasoning 

6. ASVAB subtest score: Space Perception 

7. ASVAB subtest score: Mechanical Comprehension 

8. ASVAB subtest score: Shop Information 

9. ASVAB subtest score: Auto Information 

10. ASVAB subtest score: Electronics Information 

11. ASVAB subtest score: General Information 

12. ASVAB subtest score: Math Knowledge 

13. ASVAB subtest score: General Science 

14. Mechanical AI, as conventionally derived.¹ 

15. Administrative AI, as conventionally derived.¹ 

16. General AI, as conventionally derived.¹ 

17. Electronic AI, as conventionally derived.¹ 

18-29. Educational variables. These variables were dichotomous, scored 1 if the subject had 
successfully completed a specified public school course, zero otherwise. 

30-34. Prediction composites C1M, C1A, C1G, C1E, and C1X, computed against the K 
criterion using only the ASVAB subtest scores (Variables 2-13). 

35-39. Prediction composites C2M, C2A, C2G, C2E, and C2X, computed against the K 
criterion using the subtest scores and the educational variables (Variables 2-13, 
18-29). 


Method 

The K criterion was computed in the C subsample of the M Base Group (M40 schools), and applied in 
the Target Groups (only one in this case) of that aptitude area to get the constants for correcting the final 
school grades of each subject so that all members of M schools were placed on the same criterion (the K 
criterion) metric. 

This procedure yielded a single criterion for all members of the M aptitude area, regardless of level. 
The levels were then combined, and within the M aptitude area, another R² was computed in the C 
subsample: this one between the K criterion and the 12 predictor subtest scores taken as a set. Using the 
weights emerging from this exercise, a new Mechanical AI score (called C1M) was generated for all 
subjects in all cross-validation subsamples (A, G, E, and X as well as M). This completed the development 
of the C1M composite. The same procedure was repeated in the A, G, E, and X groups to generate C1A, 
C1G, C1E, and C1X for all subjects. 

The procedure described in the previous two paragraphs was repeated, this time using the 12 subtest 
scores plus the 12 educational variables as the set of predictor variables. The prediction composites using 
all these predictors were designated as C2M, C2A, C2G, C2E, and C2X. 

At this stage, three different sets of AIs, or predictor composites, were available for comparison in the 
cross-validation sample: namely, the four composites generated in the traditional way (M, A, G, and E), 
the five C1 composites generated using the K criterion and the subtest scores only (C1M, C1A, C1G, C1E, 
and C1X), and the five C2 composites generated against the K criterion using the subtest scores plus the 
educational variables (C2M, C2A, C2G, C2E, and C2X). Validity comparisons were made in the V 
subsample between the standard AIs and the C1 and C2 composites to determine whether or not the C1 
and/or C2 composites improved prediction of final school grades in individual schools, and if so, how 
much improvement occurred. 

¹ For this study, the M, A, G, and E aptitude composites were recomputed and used in raw score (not percentile) form. Conversion 
items with ASVAB 6 and 7 would not affect the results of the study. 
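A school-by-school validity comparison of this kind can be sketched as follows. The field names (`school`, `ai`, `c1`, `grade`), the helper names, and the data are hypothetical; the sketch simply computes a Pearson correlation between each composite and final grade within each school.

```python
from math import sqrt
from collections import defaultdict

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def validities_by_school(records, composite_keys, grade_key="grade"):
    """Return {school: {composite: validity}} for the given records."""
    by_school = defaultdict(list)
    for rec in records:
        by_school[rec["school"]].append(rec)
    result = {}
    for school, rows in by_school.items():
        grades = [r[grade_key] for r in rows]
        result[school] = {
            key: pearson_r([r[key] for r in rows], grades)
            for key in composite_keys
        }
    return result

# Hypothetical cross-validation records for a single school.
v_records = [
    {"school": "S1", "ai": 1, "c1": 1, "grade": 70},
    {"school": "S1", "ai": 2, "c1": 2, "grade": 80},
    {"school": "S1", "ai": 3, "c1": 3, "grade": 90},
    {"school": "S1", "ai": 4, "c1": 5, "grade": 100},
]
validity_table = validities_by_school(v_records, ["ai", "c1"])
print(validity_table)
```

The correlations here are against the actual school grade, not the K criterion, since the purpose of the cross-validation is to see which composite predicts the observed grades better.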


III. RESULTS AND DISCUSSION 

Validity coefficients against school grades were computed within each of the [illegible] schools. The 
uncorrected validities of the AIs, the C1 composites, and the C2 composites are shown in Table 2. The 
same validities, corrected for attenuation by selection (Guilford, 1965, formulae [illegible], p. [illegible]), 
are shown in Table 3. The following observations are obvious from Table 3: 

1. There is very little difference between the C1 and C2 composites. Validities averaged (using r to 
Fisher's z transformation) across all schools were .59 for the C1 composites and .60 for the C2 composites. 
To improve validities by an average of only .01 is not worth using 12 additional predictor variables (the 
education variables). For the rest of this report, comparisons will be made only between the conventional 
AIs and the C1 composites. 

2. There is worthwhile improvement, overall, in the predictive efficiency of the C1 composites as 
compared with AIs computed in the traditional manner. As mentioned in the previous paragraph, the 
overall average validity of the C1 composite, across all [illegible] schools, was .59. The average validity of the AIs 
across all schools was .50. It should be noted, however, that the improvement in prediction using the C1 
composites may not be entirely attributable to the new way of computing the C1 composites, using the K 
criterion approach. There were at least two other differences between the formation of the traditional AIs 
and the C1 composites. First, all ASVAB subtests were used to form the C1 composites, whereas only 
selected subsets of subtest scores are used to form the traditional AIs. Second, the subtest scores 
comprising the AIs were unit weighted, whereas the C1 composite was formed by optimal weighting of all 
12 subtest scores. Experience indicates that, in a cross-validation sample, optimal weights produce very 
little more prediction than unit weights and that, at least in most situations involving a large predictor set, 
only a very few variables have weights significantly different from zero. From a practical standpoint, the 
important fact is that composites computed in the manner of the C1 composites are more efficient in 
predicting success of airmen, for whatever reason. Still, it is important to understand more exactly why the 
C1 composites are superior to the AIs computed in the usual manner. A reanalysis of these data will be 
done to control for the variance which could possibly be introduced by optimal weighting and larger 
predictor sets in forming the C1 composites. 

3. The validities of C1M, C1A, C1G, C1E, and C1X are all very similar, regardless of what is being 
predicted. For example, there is very little advantage in using the C1M composite to predict success in the 
mechanical area: C1A, C1G, C1E, and C1X all do about equally well. This is an interesting finding. It 
seems to argue that success in one area is similar to success in other areas. Also, differential prediction by 
tests of various "factors," which most researchers have been pursuing, may be, as McNemar suggested 
(McNemar, 1964), largely illusory. Certainly in this study, where no artificial controls were imposed on 
the selection and weighting of subtest scores, there is little to choose among the C1 M, A, G, and E 
composites, whatever one is predicting. Average validities of the various composites are given, by aptitude 
area, in Table 3. 
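The averaging of validities across schools mentioned in observation 1 (the r to Fisher's z transformation) can be sketched as below. This is an unweighted version; whether the report weighted schools by sample size is not stated, so that choice is an assumption, as is the function name `mean_validity`.

```python
from math import atanh, tanh

def mean_validity(validities):
    """Average correlations by way of Fisher's z: z = atanh(r), r = tanh(z)."""
    z_values = [atanh(r) for r in validities]
    return tanh(sum(z_values) / len(z_values))

# Hypothetical school validities, invented for illustration.
print(round(mean_validity([0.30, 0.70]), 3))
```

Averaging in the z metric gives slightly more weight to high correlations than a plain arithmetic mean of the r values would, because the sampling distribution of r is skewed near its limits.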

Larger differences appear among the M, A, G, and E AIs computed in the traditional manner, 
although the selector composite is sometimes not the most efficient one: probably because, in the past, 
differences among the AIs were sometimes forced even though some overall validity was lost. Even these 
conventional AIs, formed in a theoretical framework rationally designed to maximize differential validity, 
are not generally very convincing in substantiating differential prediction as a practical goal of test 
construction. 

Table 2. Comparison of Uncorrected Validities, Three Prediction 
Composites Against Technical School Final Grade 

When individual aptitude areas, individual levels, and individual schools are considered, the K 
criterion technique finds even more utility. The average of C1 composite validities for M schools is .55; the 
average conventional M aptitude index is .44 (see Table 3). C1A averages .55 for A schools, whereas the 
average AI-A is only .28. The average C1G (for G schools) is .55, while the average AI-G is .52. Finally, the 
average C1E is .65, compared with an average AI-E of .57. Certainly in the M and A areas, the C1 
composites are superior to the AI composites. In the G area, the C1 composite is slightly better than the AI, 
and in the Electronic area, the difference is well worthwhile. 

The largest improvement is obviously in the A schools, and a close scrutiny explains why. Of the 13 A 
schools, the AI-A composite yields the least prediction of all the conventional AI composites in nine of 
them (69%). In fact, in every one of the A schools, the conventional AI-G appears to be a better predictor 
than AI-A. In no other aptitude area is this true. Taking into account that the A schools comprise 11,113 
subjects (a very large sample), the development of a new Administrative composite would seem to be 
worthwhile, even if the AIs continue to be computed in the conventional way. 

Considering levels within aptitude areas, the Base groups (the groups in which the weights were 
derived which were then applied to the Target groups) would be expected to produce higher C1 validities 
than the Target groups because the equations that were instrumental in producing the K criterion were 
derived in the Base groups. If the C1 validities of the Base group are substantially higher than those of the 
Target groups, more benefit would be expected from using the C1 composite with schools at that level. 
However, the evidence argues the opposite case. 

In the M area, the average AI-M validity for the Base schools is .45 and the average C1-M validity for 
these schools is .54, an increase of .09. In the Target schools (see Table 4), the average AI-M validity is .41 
and the average C1-M is .56, an increase of .15. 


Table 4. Improvements in Prediction by C1 Composites, 
Base and Target Schools Compared 

In the A area, the average AI-A validity is .45 for Base group schools and the average C1-A validity is .55, 
a difference of .10. In the A area Target schools, the average AI-A validity is .25 and the average C1-A validity 
is .55. 

G-area Base schools produce an average AI-G validity of .40 and an average C1-G validity of .42, an 
improvement of only .02. The Target schools produce an AI-G validity of .52 and a C1-G average validity 
of .55, an improvement of .03. 

Overall, the average AI-E validity in the Base groups is .57, which increases to [illegible] (a [illegible] 
improvement) when the C1-E composite is used. In the E area Target schools, the average AI-E validity is 
.56, compared with .66 for the C1-E composite, an improvement of .10. 

In summary, the C1 composite produces more improvement in the Target schools in every instance 
(though the effect is small in the G area). This result is exactly the opposite of predictions, and the reason 
this effect should appear is unknown. At any rate, using the C1 composites in the Target schools (rather 
than the AI composites) would be more advantageous than using them in the Base schools. 

There were very large differences among individual schools in the amount of predictive 
improvement effected by the C1 composite. These differences ranged from -.12 (school [illegible]) to 
+.52 (school [illegible]). There was no increase in predictive accuracy in only 12 of the [illegible] schools. The 
validities of [illegible] schools improved at least .10 when the C1 composite is substituted for the traditional AI 
composites, 21 schools improved at least .15, and 12 improved at least .20. Clearly, there are many 
individual schools in which use of the C1 composite could result in substantial improvement in predictive 
efficiency. 


IV. CONCLUSIONS AND RECOMMENDATIONS 


The information contained in this report leads to the following conclusions and recommendations: 

1. Across all schools, the method of producing the C1 composite yields results substantially better 
than the traditional method of producing the AI composite. The traditional approach produces an average 
validity of .50, compared with .59 produced by the C1 composites. The difference between the squares of 
these validity coefficients is .10 (.35 minus .25), and the proportional improvement (.10 ÷ .25) is .40. 
This last number means that, starting with 25% predictive efficiency using the conventional AIs, a 40% 
improvement (raising 25% up to 35%) can be made by forming the AIs in the manner described for the C1 
composites. If C1 composites are used to select for some but not all the schools, much more dramatic 
results may be obtained (e.g., [illegible school codes], all the A schools, [illegible], and 
others). Using AIs computed in the traditional manner to select for some schools, and C1 composites 
computed as in this study to select for others is not a serious problem. The only additional procedure 
involved would be the computation and recording for each enlistee of an additional set of composites, an 
almost trivial procedure for a computer. 

2. The C1 composites are not substantially improved by adding educational variables to the set of 
subtest predictors. 

3. Although the primary objective of this study was not the evaluation of the predictive efficiency of 
the conventional aptitude indexes, the Administrative AI, as currently constituted, shows up as such a 
poor selector for schools in the Administrative area that this finding should be documented. Ascertaining 
the validity of this finding should be the objective of future research on AI composites and, if 
confirmed, efforts should be directed to the development of a new Administrative AI to increase the 
predictive validity of this composite. 








4. The K-composite procedure worked rather well. It is a procedure which should be useful not only 
in the context described herein, but also in academic prediction studies involving grade point averages of 
freshmen, sophomores, juniors, and seniors collapsed into a single criterion group. The procedure could 
also be used in studies predicting rating criteria collected on the same scale on subjects of different ranks 
and in other situations where criterion data are collected across groups of varying levels on scales 
restricted by arbitrary upper or lower limits. 
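The proportional-improvement arithmetic used in conclusion 1 (difference between squared validities, divided by the smaller squared validity) can be sketched as follows. The function name is hypothetical, and the validities of .50 and .59 are used here as illustrative inputs.

```python
def proportional_improvement(r_old, r_new):
    """Return (gain in r squared, gain as a proportion of the old r squared)."""
    gain = r_new ** 2 - r_old ** 2
    return gain, gain / r_old ** 2

# Illustrative validities for the conventional and the new composites.
gain, proportion = proportional_improvement(0.50, 0.59)
print(round(gain, 2), round(proportion, 2))
```

Unrounded, the gain for these inputs is a little under .10 and the proportion a little under .40; rounding the squared validities to two places first gives exactly .10 and .40.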